Data Mining in Complex Networks: Missing Link Prediction and Fuzzy Communities
Abstract
This dissertation is devoted to networks: complex interconnected systems where the individual components are connected by binary links arranged in seemingly random but intrinsically structured patterns. Networks are used to model various real-world phenomena, ranging from protein interactions in living organisms, through the large-scale organisation of human society, to the structure of technological networks such as software systems or the Internet. The first part of the dissertation studies a stochastic graph model that can be considered a possible extension of Erdős–Rényi random graphs. I discuss some basic statistical properties of the model and devise methods to find the best fit of the model to a given network instance. I also demonstrate how the fitted model can be used to predict previously unknown connections in the network. The second part of the dissertation studies overlapping communities (i.e., dense subgraphs) in sparse networks. I introduce a method based on the concept of fuzzy partition matrices and vertex similarity to uncover meaningful communities with possible overlaps and to identify bridge vertices that belong significantly to more than one community. Finally, I present applications of the link prediction and community detection methods on real-world datasets.
Similar resources
Correlations between Community Structure and Link Formation in Complex Networks
BACKGROUND: Links in complex networks commonly represent specific ties between pairs of nodes, such as protein-protein interactions in biological networks or friendships in social networks. However, understanding the mechanism of link formation in complex networks is a long-standing challenge for network analysis and data mining. METHODOLOGY/PRINCIPAL FINDINGS: Links in complex networks have a ...
Link Prediction using Network Embedding based on Global Similarity
Background: Link prediction is one of the most widely studied problems in complex network analysis. Link prediction requires knowledge of previous link connections, combined with the available information. Local link prediction approaches with node structure objectives are fast but not accurate enough. On the other hand, the global link predicti...
Prediction-Based Portfolio Optimization Model for Iran’s Oil Dependent Stocks Using Data Mining Methods
This study applied a prediction-based portfolio optimization model to explore the results of portfolio predicament in the Tehran Stock Exchange. To this end, a data mining approach was first used to predict the petroleum products and chemical industry by clustering stock market data. Then, some effective factors, such as crude oil price, exchange rate, global interest rate, gold price, an...
Mining Overlapping Communities in Real-world Networks Based on Extended Modularity Gain
Detecting communities plays a vital role in studying group-level patterns of a social network, and it can be helpful in developing recommendation systems such as movie recommendation, book recommendation, friend recommendation and so on. Most community detection algorithms can detect only disjoint communities, but in a real-world scenario, a node can be a member of more than one ...
Hierarchical Alpha-cut Fuzzy C-means, Fuzzy ARTMAP and Cox Regression Model for Customer Churn Prediction
As customers are the main asset of any organization, customer churn management is becoming a major task for organizations to retain their valuable customers. In the previous studies, the applicability and efficiency of hierarchical data mining techniques for churn prediction by combining two or more techniques have been proved to provide better performances than many single techniques over a nu...
Publication date: 2008